@vogelsgesang
Member

@vogelsgesang vogelsgesang commented Oct 20, 2025

This commit adds a move constructor, a move assignment operator, and swap
to exception_ptr. Adding these operations allows us to avoid
unnecessary calls to __cxa_{inc,dec}rement_refcount.
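
To illustrate the refcount traffic that a move avoids, here is a minimal, self-contained sketch. `ToyExceptionPtr`, `toy_increment`, `toy_decrement`, and `calls_for_handoff` are hypothetical names invented for this example; they are not libc++ or ABI entry points, and they only count calls instead of managing a real exception object.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical counter standing in for calls into the exception runtime.
// Null handles are not counted, to focus on the traffic for a live object.
static int g_refcount_calls = 0;
static void toy_increment(void* p) { if (p) ++g_refcount_calls; }
static void toy_decrement(void* p) { if (p) ++g_refcount_calls; }

// Toy refcounted handle (assumption: not the real exception_ptr layout).
class ToyExceptionPtr {
  void* ptr_;

public:
  explicit ToyExceptionPtr(void* p) : ptr_(p) { toy_increment(ptr_); }
  // Copy: shares ownership, so the count has to go up.
  ToyExceptionPtr(const ToyExceptionPtr& other) : ptr_(other.ptr_) { toy_increment(ptr_); }
  // Move: steals the pointer and leaves the source null; no refcount call.
  ToyExceptionPtr(ToyExceptionPtr&& other) noexcept : ptr_(other.ptr_) { other.ptr_ = nullptr; }
  ~ToyExceptionPtr() { toy_decrement(ptr_); }
};

// Hands a handle over to `dest` and then destroys the source.
// Returns the number of refcount calls caused by the hand-off itself.
inline int calls_for_handoff(bool use_move) {
  int object = 0;
  std::unique_ptr<ToyExceptionPtr> source(new ToyExceptionPtr(&object));
  g_refcount_calls = 0;
  if (use_move) {
    ToyExceptionPtr dest(std::move(*source));  // no increment
    source.reset();                            // source is null: no decrement
    return g_refcount_calls;
  }
  ToyExceptionPtr dest(*source);  // one increment
  source.reset();                 // one decrement
  return g_refcount_calls;
}
```

In this toy model, handing off by copy costs one increment plus one decrement, while handing off by move costs neither.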

Performance results:

TODO: update

@github-actions

github-actions bot commented Oct 20, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

Contributor

@philnik777 philnik777 left a comment

How about we instead add a swap and implement operator= as swap(exception_ptr())? That would avoid having to introduce any new symbols, at least in this patch.

@vogelsgesang vogelsgesang force-pushed the avogelsgesang-exceptionptr-move branch from b595542 to 154c286 Compare October 21, 2025 12:18
Contributor

@philnik777 philnik777 left a comment

Basically LGTM, just some nits.

@vogelsgesang
Member Author

I updated this commit to implement the move assignment using swap now:

_LIBCPP_HIDE_FROM_ABI inline exception_ptr& exception_ptr::operator=(exception_ptr&& __other) _NOEXCEPT {
  exception_ptr __tmp(std::move(__other));
  swap(__tmp, *this);
  return *this;
}

Is that what you had in mind?

Benchmark results (original implementation)

Benchmark                                         Baseline    Candidate    Difference    % Difference
----------------------------------------------  ----------  -----------  ------------  --------------
bm_exception_ptr_copy_assign_nonnull                 12.36        12.62          0.26            2.13
bm_exception_ptr_copy_assign_null                     3.97         4.06          0.09            2.23
bm_exception_ptr_copy_ctor_nonnull                   11.67        11.97          0.30            2.59
bm_exception_ptr_copy_ctor_null                       4.77         4.86          0.09            1.80
bm_exception_ptr_move_assign_nonnull                 24.07        14.28         -9.78          -40.65
bm_exception_ptr_move_assign_null                     3.98         2.42         -1.56          -39.17
bm_exception_ptr_move_copy_swap_nonnull              51.74        40.15        -11.59          -22.40
bm_exception_ptr_move_copy_swap_null                 32.63        23.07         -9.56          -29.30
bm_exception_ptr_move_copy_swap_null_optimized       30.50        21.15         -9.35          -30.66
bm_exception_ptr_move_ctor_nonnull                   23.44        14.38         -9.06          -38.65
bm_exception_ptr_move_ctor_null                       4.67         2.63         -2.05          -43.78
bm_exception_ptr_swap_nonnull                        33.51         2.64        -30.88          -92.13
bm_exception_ptr_swap_null                            7.88         2.38         -5.50          -69.82

Benchmark results (swap-based implementation)

Benchmark                                         Baseline    Candidate    Difference    % Difference
----------------------------------------------  ----------  -----------  ------------  --------------
bm_exception_ptr_copy_assign_nonnull                 12.36        12.85          0.49            3.95
bm_exception_ptr_copy_assign_null                     3.97         4.04          0.07            1.70
bm_exception_ptr_copy_ctor_nonnull                   11.67        12.15          0.49            4.17
bm_exception_ptr_copy_ctor_null                       4.77         4.80          0.03            0.61
bm_exception_ptr_move_assign_nonnull                 24.07        15.04         -9.02          -37.49
bm_exception_ptr_move_assign_null                     3.98         4.71          0.73           18.40
bm_exception_ptr_move_copy_swap_nonnull              51.74        39.24        -12.49          -24.15
bm_exception_ptr_move_copy_swap_null                 32.63        21.80        -10.83          -33.19
bm_exception_ptr_move_copy_swap_null_optimized       30.50        19.88        -10.63          -34.84
bm_exception_ptr_move_ctor_nonnull                   23.44        14.46         -8.98          -38.30
bm_exception_ptr_move_ctor_null                       4.67         2.58         -2.09          -44.81
bm_exception_ptr_swap_nonnull                        33.51         1.16        -32.36          -96.55
bm_exception_ptr_swap_null                            7.88         1.18         -6.70          -85.04

Interpretation of benchmark results

I consider the bm_exception_ptr_copy_* regressions to be noise.
For bm_exception_ptr_move_ctor_*, both variants perform the same.
The main difference is in bm_exception_ptr_move_assign_*, as expected.

In particular, bm_exception_ptr_move_assign_null even regresses compared to main. This is probably due to the additional __tmp inside operator=. The destructor of the original __other is a no-op after the move because __other.__ptr will be a nullptr. However, the compiler does not realize this, since the destructor is not inlined and lacks a fast path for the null case. As such, the swap-based implementation leads to an additional destructor call.

bm_exception_ptr_move_assign_nonnull still benefits because the swap-based move assignment avoids unnecessary __cxa_{in,de}crement_refcount calls.

As soon as we inline the destructor, this regression should disappear again. As such, I think we can live with that temporary regression - WDYT?
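
A rough sketch of the effect described above, under stated assumptions: `ToyPtr`, `toy_release`, and `release_calls_for_null_move_assign` are hypothetical names, and the counter merely stands in for the cost of entering a non-inlined destructor body. The `HasFastPath` switch models a destructor with an inlined null check; it is not how libc++ actually implements this.

```cpp
#include <cassert>
#include <utility>

// Hypothetical stand-in for the out-of-line release routine.
// It counts how often it is entered, even for null pointers.
static int g_out_of_line_calls = 0;
static void toy_release(void* p) {
  ++g_out_of_line_calls;
  (void)p;  // a real implementation would drop a reference if p is non-null
}

template <bool HasFastPath>
class ToyPtr {
  void* ptr_;

public:
  ToyPtr() noexcept : ptr_(nullptr) {}
  ToyPtr(ToyPtr&& other) noexcept : ptr_(other.ptr_) { other.ptr_ = nullptr; }

  // Swap-based move assignment, mirroring the shape quoted in this thread.
  ToyPtr& operator=(ToyPtr&& other) noexcept {
    ToyPtr tmp(std::move(other));
    swap(tmp, *this);
    return *this;  // tmp is destroyed here, holding the old value of *this
  }

  ~ToyPtr() {
    if (HasFastPath && ptr_ == nullptr) return;  // inlined null check
    toy_release(ptr_);                           // out-of-line work
  }

  friend void swap(ToyPtr& a, ToyPtr& b) noexcept { std::swap(a.ptr_, b.ptr_); }
};

// Counts out-of-line release calls caused by move-assigning null to null.
template <bool HasFastPath>
int release_calls_for_null_move_assign() {
  ToyPtr<HasFastPath> dst, src;
  g_out_of_line_calls = 0;
  dst = std::move(src);
  return g_out_of_line_calls;
}
```

Without the fast path, the temporary inside `operator=` costs one out-of-line call even though every pointer involved is null; with the fast path, that call goes away in this model.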

@philnik777
Contributor

I updated this commit to implement the move assignment using swap now:

_LIBCPP_HIDE_FROM_ABI inline exception_ptr& exception_ptr::operator=(exception_ptr&& __other) _NOEXCEPT {
  exception_ptr __tmp(std::move(__other));
  swap(__tmp, *this);
  return *this;
}

Is that what you had in mind?

Yes, exactly.


As soon as we inline the destructor, this regression should disappear again. As such, I think we can live with that temporary regression - WDYT?

Yeah, I think that's fine.

Also, please update the commit message to mention that we introduce swap as well.

@vogelsgesang
Member Author

vogelsgesang commented Oct 21, 2025

Yeah, I think that's fine.

👍 then I think we have high-level alignment on this PR. The next step will be to polish it for final review.

Also, please update the commit message to mention that we introduce swap as well.

I will do so after #164278 has shipped, so I won't have to repeatedly rebase this PR and redo the measurements anymore.

One more question:
The build currently fails due to #define move SYSTEM_RESERVED_NAME.
Should I simply add _LIBCPP_PUSH_MACROS #include <__undef_macros>? Or is there some other recommended solution?

@philnik777
Contributor

One more question: The build currently fails due to #define move SYSTEM_RESERVED_NAME. Should I simply add _LIBCPP_PUSH_MACROS #include <__undef_macros>? Or is there some other recommended solution?

Yeah, that should fix the CI.
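
For readers outside libc++: `_LIBCPP_PUSH_MACROS` and `<__undef_macros>` are libc++-internal and cannot be used in user code. As far as I understand, they are built on the compiler's `push_macro` / `pop_macro` pragmas. The sketch below uses those pragmas directly (an extension supported by Clang, GCC, and MSVC) to show the general idea; it is not the libc++ header code, and `take` and `macro_restored` are hypothetical names.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Simulate a hostile user macro, similar to what the CI check defines.
#define move SYSTEM_RESERVED_NAME

// Save the current definition of `move`, then remove it for this region.
#pragma push_macro("move")
#undef move

// Inside the protected region the identifier `move` is usable again.
inline std::string take(std::string& s) { return std::move(s); }

// Restore whatever definition the user had before.
#pragma pop_macro("move")

#ifdef move
constexpr bool macro_restored = true;
#else
constexpr bool macro_restored = false;
#endif

// Clean up the simulated macro so that later code is unaffected.
#undef move
```

The user-visible macro state is the same before and after the protected region, which is why this kind of guard can wrap header code that needs to spell `std::move`.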

This commit adds move constructor and move assignment to
`exception_ptr`. Adding those operators allows us to avoid unnecessary
calls to `__cxa_{inc,dec}rement_refcount`.

Performance results:

```
Benchmark                          Baseline    Candidate    Difference    % Difference
-------------------------------  ----------  -----------  ------------  --------------
bm_nonnull_exception_ptr              52.22        40.92        -11.31          -21.65
bm_null_exception_ptr                 31.41        23.29         -8.12          -25.85
bm_optimized_null_exception_ptr       28.69        20.50         -8.19          -28.55
```

This commit does not add a `swap` specialization. Thanks to the added
move-assignment, we already save a couple of increments/decrements also
in the default `swap` implementation. The default `swap` is still not
perfect, as it calls the destructor on `tmp`. As soon as we also
inline the `~exception_ptr` destructor fast-path for
`__ptr == nullptr`, the optimizer should be able to optimize the default
`swap` just as well as a specialized `swap`, though.
@vogelsgesang vogelsgesang force-pushed the avogelsgesang-exceptionptr-move branch from 154c286 to 0fcaeee Compare October 27, 2025 20:10
@vogelsgesang vogelsgesang force-pushed the avogelsgesang-exceptionptr-move branch from 0fcaeee to b106dc9 Compare October 27, 2025 20:16
@vogelsgesang
Member Author

vogelsgesang commented Oct 27, 2025

/libcxx-bot benchmark libcxx/test/benchmarks/exception_ptr.bench.cpp

Benchmark results:
Benchmark                               Baseline    Candidate    Difference    % Difference
------------------------------------  ----------  -----------  ------------  --------------
bm_exception_ptr_copy_assign_nonnull        9.77         9.94          0.18           1.79%
bm_exception_ptr_copy_assign_null          10.29        10.65          0.35           3.42%
bm_exception_ptr_copy_ctor_nonnull          7.02         7.01         -0.01          -0.13%
bm_exception_ptr_copy_ctor_null            10.54        10.60          0.06           0.56%
bm_exception_ptr_move_assign_nonnull       16.92        13.76         -3.16         -18.70%
bm_exception_ptr_move_assign_null          10.61        10.76          0.14           1.36%
bm_exception_ptr_move_ctor_nonnull         13.31        10.25         -3.06         -23.02%
bm_exception_ptr_move_ctor_null            10.28         7.30         -2.98         -28.95%
bm_exception_ptr_swap_nonnull              19.22         0.63        -18.59         -96.74%
bm_exception_ptr_swap_null                 20.02         7.79        -12.23         -61.07%
bm_make_exception_ptr/threads:1            32.30        32.44          0.14           0.43%
bm_make_exception_ptr/threads:2            16.17        16.34          0.17           1.07%
bm_make_exception_ptr/threads:4             8.32         8.40          0.07           0.90%
bm_make_exception_ptr/threads:8             4.19         4.18         -0.00          -0.11%
Geomean                                    12.00         8.34         -3.65         -30.46%

Contributor

@philnik777 philnik777 left a comment

Could you add some basic tests for the new functions? I'm pretty sure everything should be testable portably to some extent.
